Skip to content

refactor(BA-5713): collapse scheduler signatures to owner_id#11047

Closed
jopemachine wants to merge 7 commits into
refactor/BA-5650-D-session-repo-owner-idfrom
refactor/BA-5650-E-scheduler-owner-id
Closed

refactor(BA-5713): collapse scheduler signatures to owner_id#11047
jopemachine wants to merge 7 commits into
refactor/BA-5650-D-session-repo-owner-idfrom
refactor/BA-5650-E-scheduler-owner-id

Conversation

@jopemachine
Copy link
Copy Markdown
Member

@jopemachine jopemachine commented Apr 14, 2026

📚 Stack

Merge from top to bottom. Intermediate slices may not build standalone; the final refactor tip is #11051, and #11040 is the schema-drop follow-up on top of it.

Summary

scheduler/predicates.py: predicates resolve the owner's main_access_key via helper; references to SessionRow.user_uuid throughout. scheduler/drf.py: fairness keyed by (owner_id, main_access_key). repositories/scheduler/options.py: drop duplicated by_access_key_* factories. repositories/scheduler/types/*: rename access_keymain_access_key on ScheduledSessionData, TerminatingSessionData, SweptSessionInfo, KernelEnqueueData, SessionEnqueueData. events/stream db_source: owner-UUID sub-select shim.

Resolves BA-5713. Part of epic BA-5650.


BA-5650 Series: Split Rationale

Overall goal: migrate the session owner identifier from access_key (keypair) to owner_id (user UUID), and drop the sessions.access_key / kernels.access_key columns.

Split criteria: layer + dependency order. Bottom-up (DB helpers → service → API) so the destructive column drop can land safely at the end.

Order Issue Layer Role
1 BA-5709 Foundation Add get_main_access_key_by_id and related resolver helpers (everything else depends on this)
2 BA-5710 RBAC / Permission Rename UserPermission.user_uuid → owner_id; add main_access_key field
3 BA-5711 Data Rename SessionData.user_uuid → owner_id; Row adapters; GQL node
4 BA-5712 Repository Collapse SessionRepository, SessionDBSource, creators signatures
5 BA-5713 Scheduler predicates / drf / options; scheduler repo; stream/events shims
6 BA-5714 Sokovan allocation / lifecycle / workload, launcher, controller, executor
7 BA-5715 Service Drop owner_id from 21 read/control Actions; resolve via current_user()
8 BA-5716 API (v1) Remove owner_access_key from REST v1 DTOs (breaking)
9 BA-5717 Tests / Cleanup Test suite updates; remaining ORM / gql_legacy touch-ups
10 BA-5653 Schema Alembic migration — rename/drop columns (destructive, must land last)

Why this split

  • Dependencies: the BA-5709 helpers are required before 3–7 can recover access_key from owner_id. BA-5653 (destructive) only runs once every reader has migrated off the dropped column.
  • Review size: a single-shot change would touch hundreds of files. Slicing by layer keeps each PR focused on one concern.
  • Rollback safety: steps 1–7 are no-op on external behavior, step 8 is the API breaking change, step 10 is the schema drop. Each step is independently revertible.

@github-actions github-actions Bot added size:L 100~500 LoC comp:manager Related to Manager component labels Apr 14, 2026
@jopemachine jopemachine changed the title refactor(BA-5650-E): collapse scheduler signatures to owner_id refactor(BA-5713): collapse scheduler signatures to owner_id Apr 14, 2026
@jopemachine jopemachine marked this pull request as draft April 14, 2026 07:30
@jopemachine jopemachine force-pushed the refactor/BA-5650-D-session-repo-owner-id branch from 719ff4b to f811d03 Compare April 14, 2026 07:39
@jopemachine jopemachine force-pushed the refactor/BA-5650-E-scheduler-owner-id branch from 3b7a216 to 18a8ec2 Compare April 14, 2026 07:39
@jopemachine jopemachine requested a review from Copilot April 14, 2026 07:46
Comment thread changes/11047.enhance.md
Comment thread src/ai/backend/manager/repositories/scheduler/types/session.py Outdated
@jopemachine jopemachine force-pushed the refactor/BA-5650-D-session-repo-owner-id branch from f811d03 to c2d0d94 Compare April 14, 2026 07:52
@jopemachine jopemachine force-pushed the refactor/BA-5650-E-scheduler-owner-id branch from 18a8ec2 to 17d3c3a Compare April 14, 2026 07:52
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR is part of the BA-5650 refactor stack, updating scheduler- and scheduler-repository-facing types/queries to key scheduling behavior off owner_id and the owner’s resolved main_access_key (instead of relying on sessions.access_key snapshots).

Changes:

  • Add a helper in scheduler predicates to resolve main_access_key from users via SessionRow.user_uuid, and switch keypair-policy/concurrency lookups to use that value.
  • Update DRF scheduler fairness tracking to be keyed by user UUID instead of AccessKey.
  • Rename several scheduler repository DTO fields from access_keymain_access_key and user_uuidowner_id, and remove redundant by_access_key_* condition factories from scheduler repository options.
  • Add “access_key → owner UUID” shims in stream/events DB sources.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
src/ai/backend/manager/scheduler/predicates.py Resolve owner main access key for keypair-scoped predicates (concurrency/resource-policy/pending limits).
src/ai/backend/manager/scheduler/drf.py Switch DRF fairness bookkeeping to use user UUID keys.
src/ai/backend/manager/repositories/stream/db_source/db_source.py Resolve owner_id from access_key before fetching streaming sessions.
src/ai/backend/manager/repositories/events/db_source/db_source.py Resolve owner_id from access_key for event session lookups.
src/ai/backend/manager/repositories/scheduler/types/session_creation.py Rename enqueue DTO fields to owner_id / main_access_key.
src/ai/backend/manager/repositories/scheduler/types/session.py Rename scheduler session DTO fields to owner_id / main_access_key (partial).
src/ai/backend/manager/repositories/scheduler/types/results.py Rename scheduled result DTO field to main_access_key.
src/ai/backend/manager/repositories/scheduler/types/allocation.py Update comment to reflect main access key semantics.
src/ai/backend/manager/repositories/scheduler/options.py Remove duplicated access-key string match condition factories; rename user UUID filter factory methods to owner_id naming.
changes/11047.misc.md Changelog entry for the refactor slice.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/ai/backend/manager/repositories/scheduler/types/session.py
Comment thread src/ai/backend/manager/repositories/scheduler/types/session.py
Comment thread src/ai/backend/manager/scheduler/drf.py Outdated
Comment thread src/ai/backend/manager/scheduler/predicates.py
Comment thread src/ai/backend/manager/scheduler/predicates.py
Comment thread src/ai/backend/manager/scheduler/predicates.py
Comment thread src/ai/backend/manager/repositories/scheduler/types/results.py
Comment thread src/ai/backend/manager/repositories/scheduler/types/session.py
Comment thread src/ai/backend/manager/scheduler/predicates.py
@jopemachine jopemachine force-pushed the refactor/BA-5650-D-session-repo-owner-id branch from 2caf306 to c1a5f32 Compare April 14, 2026 08:07
@jopemachine jopemachine force-pushed the refactor/BA-5650-E-scheduler-owner-id branch from 3ece8ba to 81cdac3 Compare April 14, 2026 08:08
Comment thread src/ai/backend/manager/repositories/scheduler/types/allocation.py Outdated
Comment thread src/ai/backend/manager/repositories/stream/db_source/db_source.py Outdated
@jopemachine jopemachine force-pushed the refactor/BA-5650-D-session-repo-owner-id branch from e2ff93b to 765d164 Compare April 14, 2026 08:27
@jopemachine jopemachine force-pushed the refactor/BA-5650-E-scheduler-owner-id branch from 7ab7788 to e10c89d Compare April 14, 2026 08:31
@jopemachine jopemachine force-pushed the refactor/BA-5650-D-session-repo-owner-id branch from 4ecaf01 to 82ea2d0 Compare April 14, 2026 10:16
jopemachine and others added 7 commits April 14, 2026 19:17
``SessionRepository`` and the underlying ``SessionDBSource`` now take
``owner_id: UUID`` on every method that previously accepted
``owner_access_key: AccessKey``. Affects:

- ``get_session_validated``
- ``match_sessions``
- ``update_session_name``
- ``find_dependency_sessions`` / ``_find_dependent_sessions``
- ``get_target_session_ids``
- ``get_session_with_group``

The matching ``dependency_graph`` helpers and ``creators`` are updated
in lockstep. Service-layer callers still pass ``owner_access_key``
temporarily; they will be migrated in the next slice.
Scheduler / predicates / scheduler-type collapse of the owner key:

- ``scheduler/predicates.py``: predicates now take SessionRow and
  resolve ``main_access_key`` from the owner via a helper when a
  keypair-scoped lookup (Redis concurrency, keypair resource policy)
  is required. Renames ``SessionRow.user_uuid`` references throughout.
- ``scheduler/drf.py``: per-user fairness tracking keyed by
  ``owner_id``/``main_access_key`` pair.
- ``repositories/scheduler/options.py``: drop the duplicated
  ``by_access_key_*`` factories — session filters go through
  ``SessionConditions`` helpers instead.
- ``repositories/scheduler/types/*``: rename ``access_key`` to
  ``main_access_key`` on ``ScheduledSessionData``,
  ``TerminatingSessionData``, ``SweptSessionInfo``, ``KernelEnqueueData``
  and ``SessionEnqueueData``.
- ``repositories/events/db_source/db_source.py`` and
  ``repositories/stream/db_source/db_source.py``: resolve the owner
  UUID from ``main_access_key`` via a sub-select shim while the schema
  still exposes ``sessions.access_key``.
- repositories/scheduler/types/session.py: rename
  PendingSessionData.access_key -> main_access_key (and drop the
  outdated resolved-main_access_key comment).
- repositories/scheduler/db_source/db_source.py: update the
  PendingSessionData call site to main_access_key + owner_id keyword
  names matching the dataclass.
- scheduler/drf.py: use existing_sess.user_uuid (SessionRow stores the
  owner UUID there, not owner_id).
- scheduler/predicates.py: guard every _resolve_main_access_key
  consumer (check_concurrency, check_keypair_resource_limit,
  check_pending_session_count_limit, check_pending_session_resource_limit)
  with an early main_ak-is-None return so that NULL main_access_key
  users don't fall through to keypair policy lookups that match with
  NULL.
- repositories/scheduler/types/allocation.py: rename SessionAllocation's
  ``access_key`` field to ``main_access_key`` and drop the stale
  explanatory comment.
- repositories/stream/db_source/db_source.py: raise UserNotFound when
  no user owns the supplied ``main_access_key`` — SessionNotFound was
  misleading since the failure is about the user lookup, not the
  session.
- Drop the leftover ``changes/BA-5650-D.misc.md`` file.
- Scheduler db_source and preparer now use ``main_access_key`` /
  ``owner_id`` on ``SessionEnqueueData``, ``KernelEnqueueData``,
  ``TerminatingSessionData``, ``SweptSessionInfo``, and
  ``ScheduledSessionData``.
- ``PendingSessions.owner_ids`` replaces ``user_uuids`` in db_source
  call sites.
- ``scheduling_controller`` reads ``session_spec.access_key``
  (SessionCreationSpec still uses this name pre-sokovan rename).
- ``PendingSessionData.to_session_workload`` maps
  ``main_access_key`` → ``SessionWorkload.access_key`` and
  ``owner_id`` → ``SessionWorkload.user_uuid`` so the bridge works
  until slice F renames the sokovan SessionWorkload fields.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jopemachine jopemachine force-pushed the refactor/BA-5650-E-scheduler-owner-id branch from e10c89d to b3067da Compare April 14, 2026 10:20
@jopemachine
Copy link
Copy Markdown
Member Author

Consolidated into the merged PR stack (A+B+C+D, E+F, G+H, I)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:manager Related to Manager component size:L 100~500 LoC

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants